Soft Clustering with Projections: PCA, ICA, and Laplacian

Authors

  • David Gleich
  • Leonid Zhukov
Abstract

In this paper we present a comparison of three projection methods that use the eigenvectors of a matrix to investigate a high-dimensional dataset: principal component analysis (PCA), principal component analysis followed by independent component analysis (PCA+ICA), and Laplacian projections. We demonstrate the application of these methods to a sponsored links search listings dataset and compare the results both by examining the qualities of the projected dataset and by looking at the topics represented by each soft cluster.
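
As a rough illustration of the three projections named in the abstract, the following Python sketch uses NumPy only. It is not the authors' implementation: the function names, the tanh-based symmetric FastICA variant, the Gaussian-kernel affinity matrix, and the choice of the normalized graph Laplacian are all assumptions made for this example, and the data are synthetic.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the top-k principal directions."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # n x k PCA scores

def ica_rotate(Z, n_iter=200, seed=0):
    """Rotate PCA scores Z with a symmetric FastICA (tanh) iteration.

    Assumption: Z holds PCA scores, which are already uncorrelated,
    so rescaling each column to unit variance whitens them.
    """
    rng = np.random.default_rng(seed)
    Z = Z - Z.mean(axis=0)
    Z = Z / Z.std(axis=0)
    n, k = Z.shape
    W = np.linalg.qr(rng.standard_normal((k, k)))[0]   # random orthogonal start
    for _ in range(n_iter):
        Y = Z @ W.T
        G = np.tanh(Y)
        # FastICA fixed-point update, one row of W per component
        W_new = (G.T @ Z) / n - np.diag((1.0 - G**2).mean(axis=0)) @ W
        # symmetric decorrelation: nearest orthogonal matrix to W_new
        U, _, Vt = np.linalg.svd(W_new)
        W = U @ Vt
    return Z @ W.T                               # n x k independent components

def laplacian_project(A, k):
    """Embed graph nodes using eigenvectors of the normalized Laplacian.

    Assumption: A is a symmetric, nonnegative affinity matrix of a
    connected graph, so only the first eigenvector is trivial.
    """
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                  # eigenvalues in ascending order
    return vecs[:, 1:k + 1]                      # skip the trivial eigenvector

# Synthetic stand-in for a high-dimensional dataset (hypothetical data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 6)), rng.normal(3.0, 1.0, (20, 6))])

pca_scores = pca_project(X, 2)
ica_scores = ica_rotate(pca_scores)

# Gaussian-kernel affinities between rows of X (an assumed graph construction).
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-sq_dist / (2.0 * np.median(sq_dist)))
lap_scores = laplacian_project(A, 2)
```

Each function returns one coordinate per retained component for every data point; in a soft-clustering reading, the magnitude of a point's coordinate along a component can be treated as its degree of association with that component.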

Similar resources

Topic Identification in Soft Clustering using PCA and ICA

Many applications can benefit from soft clustering, where each datum is assigned to multiple clusters with membership weights that sum to one. In this paper we present a comparison of principal component analysis (PCA) and independent component analysis (ICA) when used for soft clustering. We provide a short mathematical background for these methods and demonstrate their application to a sponso...

The small sample size problem of ICA: A comparative study and analysis

On small sample size problems such as appearance-based recognition, empirical studies have shown that ICA projections have only a trivial effect on recognition performance compared with whitened PCA. However, what causes the ineffectiveness of ICA is still an open question. In this study, we find that this small sample size problem of ICA is caused by a special distributional phenomenon o...

Sparse ICA via cluster-wise PCA

In this paper, it is shown that independent component analysis (ICA) of sparse signals (sparse ICA) can be seen as a cluster-wise principal component analysis (PCA). Consequently, sparse ICA may be done by a combination of a clustering algorithm and PCA. For the clustering part, we use an algorithm inspired by K-means. The final algorithm is easy to implement for any number of...

Comparison of MLP NN Approach with PCA and ICA for Extraction of Hidden Regulatory Signals in Biological Networks

Biologists now face masses of high-dimensional datasets generated by various high-throughput technologies, which are outputs of complex interconnected biological networks at different levels, driven by a number of hidden regulatory signals. So far, many computational and statistical methods such as PCA and ICA have been employed for computing low-dimensional or hidden represe...

The Independent and Principal Component of Graph Spectra

In this paper, we demonstrate how PCA and ICA can be used for embedding graphs in pattern-spaces. Graph spectral feature vectors are calculated from the leading eigenvalues and eigenvectors of the unweighted graph adjacency matrix. The vectors are then embedded in a lower dimensional pattern space using both the PCA and ICA decomposition methods. Synthetic and real sequences are tested using th...

Publication date: 2005